perm filename GALLAG.LE1[ESS,JMC]1 blob sn#059034 filedate 1973-08-19 generic text, type C, neo UTF8
Dear Mr. Gallagher:

	In accordance with  our conversations, this is a  request for
a grant  of $10,000 from the Associated  Press to Stanford University
for  an  approximately  one  year  study  of  improved   methods  for
maintaining, automatically  indexing, and  using a  computerized news
data  base.  A major goal of the work  will be to see if a completely
automatic  storage and  indexing  system  can provide  a  useful  and
cost-effective service.

	It seems to me that  this can further your interest in having
an automatic  library  for  the  A.P.   staff  and  our  interest  in
experimenting with services for home, university and business use.

	Our present  ideas for  improving our  A.P.   news data  base
include the following:

	1. Index  on all non-trivial words in  the stories.  We think
this will not be too expensive in computer time and storage,   but we
aren't sure yet.

	2.  Allow category names that catch all occurrences of words
in the category, e.g. the category "animal" would collect stories
containing the words "dog", "cat", etc.

	3.  Catch  inflected forms of  words,  e.g.   a reference  to
"senator" would catch the word "senators".

	4.  Increase the size of the data base.  This is important
because bigger data bases will require better filtering methods in
order to search them usefully.  At Mr. Bowen's suggestion, which
seems quite reasonable to us, we plan to store the A-wire
continuously for perhaps a year.  This will require substantial disk
storage; the disk space for a year of A-wire will cost about $750 per
month to rent from IBM.  Devising methods of eliminating redundant
information may reduce this requirement.

	For other  experiments,   it may  be worthwhile  to put  some
other  wires  into the  machine.   We  have  in  mind the  California
regional wire, Dataspeed, and the B-wire.

	5.  Study the utility of the system.  Our subjective
impression and the impression gained from ARPA net use is that the
system is useful in keeping up with a day's news even in its present
form.  However, it really requires more formal testing.  We suggest
the following:

	  a.   A.P.   may wish  to make  it available  internally for
experimental  use and for  suggestions.  You  should equip yourselves
with a  30  character per  second  printer like  that made  by  Texas
Instruments  and  prepare  to  suffer  a  substantial  long  distance
telephone bill.   If competition with  other users of  our two  input
ports proves  annoying,   a  priority port  can be  provided at  some
expense.

	  b.  Stanford faculty in the communications and political
science departments who use current news may be able to help
evaluate the system if we provide a terminal for them.

	  c.  If you approve, we would like to test the system's
utility to a business firm.  I have in mind the headquarters of the
Bank of America in San Francisco, because I know someone there.  I
haven't discussed it with them, but it would be interesting to see
if they would find it worth (say) $175 per month plus the cost of
renting a suitable terminal.
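
	Ideas 1 through 3 above (indexing on all non-trivial words,
category names, and inflected forms) can be illustrated with a small
sketch.  It is only an illustration under stated assumptions, not a
description of the actual system: the stop list, the category table,
the plural-stripping rule, the sample stories, and every function
name are hypothetical.  The plural rule in particular is deliberately
crude and would mangle words such as "news".

```python
import re
from collections import defaultdict

# Hypothetical stop list of "trivial" words; the letter does not give one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "on", "is", "was"}

# Hypothetical category table, following the letter's "animal" example.
CATEGORIES = {"animal": {"dog", "cat"}}


def normalize(word):
    """Crude inflection folding: lowercase and strip a trailing plural "s"."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    return word


def build_index(stories):
    """Map each normalized non-trivial word to the ids of stories containing it."""
    index = defaultdict(set)
    for story_id, text in stories.items():
        for raw in re.findall(r"[A-Za-z]+", text):
            word = normalize(raw)
            if word not in STOP_WORDS:
                index[word].add(story_id)
    return index


def search(index, term):
    """Look up a word or a category name; a category matches any member word."""
    term = normalize(term)
    words = CATEGORIES.get(term, set()) | {term}
    hits = set()
    for w in words:
        hits |= index.get(normalize(w), set())
    return hits


# Made-up sample stories, used only to exercise the sketch.
stories = {
    1: "Two senators spoke on the farm bill.",
    2: "A dog was rescued in San Francisco.",
    3: "The senator met the press.",
}
index = build_index(stories)
```

	With these sample stories, a search for "senator" finds
stories 1 and 3, and a search for the category "animal" finds story
2.  Whether indexing every non-trivial word is affordable in computer
time and storage is exactly the open question raised in item 1; the
sketch says nothing about cost.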

	It  is expected  that the  work  will lead  to  a PhD  thesis
covering  this and  other matters  and that a  report of  the results
will be published in some computer journal.

	Our proposed budget is as follows:





	This proposal  has  the  approval of  the  Administration  of
Stanford University.